Distinctive Parts for Relative Attributes

Author

  • C. V. Jawahar
Abstract

Visual attributes are properties observable in images that have human-designated names (e.g., smiling, natural). They are valuable as a semantic cue in vision problems such as face verification, object recognition, describing unfamiliar objects, and zero-shot transfer learning. Most work on attributes focuses on binary attributes, which indicate the presence or absence of an attribute. The notion of relative attributes, introduced by Parikh and Grauman at ICCV 2011, provides a more appealing way of comparing two images based on their visual properties than binary attributes do. Relative visual properties are a semantically rich way by which humans describe and compare objects in the world. They are necessary, for instance, to refine an identifying description ("the rounder pillow"; "the same except bluer") or to situate an object with respect to reference objects ("brighter than a candle"; "dimmer than a flashlight"). Furthermore, they have the potential to enhance active and interactive learning, for instance by offering a better guide for visual search ("find me similar shoes, but shinier" or "refine the retrieved images of downtown Chicago to those taken on sunnier days"). For learning relative attributes, a ranking SVM based formulation was proposed that uses globally represented pairs of annotated images. In this thesis, we extend this idea towards learning relative attributes using local parts that are shared across categories. First, we propose a part-based representation that jointly represents a pair of images. For facial attributes, a part corresponds to a block around a landmark point detected using a domain-specific method. This representation explicitly encodes correspondences among parts, and thus captures the minute differences in parts that make an attribute more prominent in one image than in another better than a global representation does.
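The pairwise ranking idea with a part-based joint representation can be illustrated with a minimal sketch. It assumes each image has already been reduced to a fixed number of part descriptors (a hypothetical array of shape `(N_PARTS, PART_DIM)`), represents an ordered pair by concatenating per-part differences, and fits a linear ranker with a hinge loss using plain subgradient descent. The function names, the synthetic data, and the optimiser are illustrative assumptions and are not taken from the thesis.

```python
import numpy as np

# Hypothetical sizes: 4 parts per image, 3 feature dimensions per part.
N_PARTS, PART_DIM, SIGNAL_PART = 4, 3, 2


def make_image(strength, rng):
    """Synthetic image: small noise everywhere, attribute strength stored
    in the first dimension of one part (an assumption for illustration)."""
    parts = rng.normal(scale=0.01, size=(N_PARTS, PART_DIM))
    parts[SIGNAL_PART, 0] = strength
    return parts


def make_pairs(n_pairs, rng):
    """Ordered pairs (stronger, weaker) with respect to the attribute."""
    pairs = []
    for _ in range(n_pairs):
        hi, lo = sorted(rng.uniform(size=2), reverse=True)
        pairs.append((make_image(hi, rng), make_image(lo, rng)))
    return pairs


def pair_representation(parts_a, parts_b):
    """Joint pair representation: concatenated per-part differences, so
    corresponding parts of the two images are compared directly."""
    return (parts_a - parts_b).ravel()


def train_ranker(pairs, n_epochs=200, lr=0.1, reg=1e-3):
    """Linear ranker: minimise the L2-regularised hinge loss
    max(0, 1 - w . d) over pair representations d."""
    X = np.stack([pair_representation(a, b) for a, b in pairs])
    w = np.zeros(X.shape[1])
    for _ in range(n_epochs):
        violated = (X @ w) < 1.0  # pairs not yet ranked with margin 1
        grad = reg * w - X[violated].sum(axis=0) / len(X)
        w -= lr * grad
    return w


def rank_score(w, parts):
    """Predicted attribute strength of a single image."""
    return float(w @ parts.ravel())


rng = np.random.default_rng(0)
train_pairs = make_pairs(50, rng)
w = train_ranker(train_pairs)
correct = np.mean([rank_score(w, a) > rank_score(w, b) for a, b in train_pairs])
print("fraction of training pairs ordered correctly:", correct)
```

Because the pair representation keeps one difference block per part, the learned weight vector can be read part by part; in this toy setup the weight with the largest magnitude sits in the block of the part that carries the attribute signal.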
Next, we update this part-based representation by additionally learning a weight for each part that denotes its contribution towards predicting the strength of a given attribute. We call these weights the significance coefficients of parts. For each attribute, the significance coefficients are learned discriminatively and simultaneously with a max-margin ranking model. Thus the best parts for predicting the relative attribute "more smiling" will be different from those for predicting "more eyes open". We compare the baseline method of Parikh and Grauman with the proposed method under various settings. We have collected a new dataset of 10,000 pairwise attribute-level annotations using images from the Labeled Faces in the Wild (LFW) dataset, focusing particularly on a large variety of samples in terms of pose, lighting conditions, etc., and completely ignoring category information while collecting attribute annotations. Extensive experiments demonstrate that the new method significantly improves prediction accuracy compared to the baseline method. Moreover the learned parts


Similar resources

Interval MULTIMOORA method with target values of attributes based on interval distance and preference degree: biomaterials selection

A target-based MADM method covers beneficial and non-beneficial attributes besides target values for some attributes. Such techniques are considered as the comprehensive forms of MADM approaches. Target-based MADM methods can also be used in traditional decision-making problems in which beneficial and non-beneficial attributes only exist. In many practical selection problems, some attributes ha...


Modeling phonological recognition of Persian words

A model of spoken word recognition is proposed. This model is particularly concerned with extraction of cues from the signal leading to a specification of a word in terms of bundles of distinctive features, which are assumed to be the building blocks of words. In the model proposed, auditory input is chunked into a set of successive time slices. It is assumed that the derivation of the underly...


Classification of stop consonant place of articulation

One of the approaches to automatic speech recognition is a distinctive feature-based speech recognition system, in which each of the underlying word segments is represented with a set of distinctive features. This thesis presents a study concerning acoustic attributes used for identifying the place of articulation features for stop consonant segments. The acoustic attributes are selected so tha...


Extracting concept descriptions from the Web: the importance of attributes and values

When extracting information about concepts from the Web, the problem is not recall, but precision: trying to identify which properties of a concept are genuinely distinctive. We discuss a series of experiments in empirical ontology using both unsupervised and supervised methods, showing that not all semantic relations we can extract from text are equally useful, and suggesting that attempting t...


Joint Likelihood Methods for Mitigating Visual Tracking Disturbances

We describe a framework that explicitly reasons about data association and combines estimates to improve tracking performance in many difficult visual environments. This work extends two previously reported algorithms: the PDAF, which handles single-target tracking tasks involving agile motions and clutter, and the JPDAF, which shares information between multiple same-modality trackers (such as...




Publication date: 2014